
Speech corpus

A speech corpus (or spoken corpus) is a database of speech audio files and text transcriptions. In speech technology, speech corpora are used, among other things, to create acoustic models (which can then be used with a speech recognition or speaker identification engine). In linguistics, spoken corpora are used for research into phonetics, conversation analysis, and other fields.

A corpus is a single such database; corpora, the plural form, refers to several such databases.
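
As a rough illustration of the definition above, a speech corpus can be modeled as a collection of entries, each pairing an audio file with its text transcription. This is only a minimal sketch: the class names `Utterance` and `SpeechCorpus`, the `vocabulary` helper, and the file paths are hypothetical and do not come from any particular corpus or toolkit.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class Utterance:
    """One corpus entry: an audio file paired with its text transcription."""
    audio_path: Path   # hypothetical location of the recording
    transcript: str    # text transcription of that recording


@dataclass
class SpeechCorpus:
    """A speech corpus modeled as a named collection of utterances."""
    name: str
    utterances: list[Utterance] = field(default_factory=list)

    def add(self, audio_path: str, transcript: str) -> None:
        """Register a recording together with its transcription."""
        self.utterances.append(Utterance(Path(audio_path), transcript))

    def vocabulary(self) -> set[str]:
        """Lower-cased, whitespace-separated words across all transcripts."""
        return {w.lower() for u in self.utterances for w in u.transcript.split()}


# Hypothetical example data; the audio files are not opened here.
corpus = SpeechCorpus("demo")
corpus.add("audio/0001.wav", "the orange cat")
corpus.add("audio/0002.wav", "The cat sat")
```

Real corpora usually carry more metadata per entry (speaker, recording conditions, time alignments), which could be added as further fields on the entry type.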

There are two types of speech corpora:

  1. Read Speech, which includes:
    • Book excerpts
    • Broadcast news
    • Lists of words
    • Sequences of numbers
  2. Spontaneous Speech, which includes:
    • Dialogs – between two or more people (includes meetings; one such corpus is the KEC);
    • Narratives – a person telling a story;
    • Map-tasks – one person explains a route on a map to another;
    • Appointment-tasks – two people try to find a common meeting time based on individual schedules.
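
The two-level classification above could be encoded as a small lookup table, for example to tag corpus entries by type. The sketch below assumes hypothetical short labels (`book_excerpt`, `map_task`, and so on) for the listed sub-categories; the names `SpeechType`, `SUBTYPES`, and `subtypes_of` are illustrative only.

```python
from enum import Enum


class SpeechType(Enum):
    """The two top-level types of speech corpora."""
    READ = "read speech"
    SPONTANEOUS = "spontaneous speech"


# Sub-categories from the list above, keyed by hypothetical short labels.
SUBTYPES = {
    "book_excerpt": SpeechType.READ,
    "broadcast_news": SpeechType.READ,
    "word_list": SpeechType.READ,
    "number_sequence": SpeechType.READ,
    "dialog": SpeechType.SPONTANEOUS,
    "narrative": SpeechType.SPONTANEOUS,
    "map_task": SpeechType.SPONTANEOUS,
    "appointment_task": SpeechType.SPONTANEOUS,
}


def subtypes_of(speech_type: SpeechType) -> list[str]:
    """Return the labels belonging to one top-level type, sorted by name."""
    return sorted(label for label, t in SUBTYPES.items() if t is speech_type)
```

Calling `subtypes_of(SpeechType.SPONTANEOUS)` returns the four spontaneous-speech labels in alphabetical order.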

A special kind of speech corpus is the non-native speech database, which contains speech with a foreign accent.


See also

  • Edwards, Jane / Lampert, Martin (eds.) (1992): Talking Data – Transcription and Coding in Discourse Research. Hillsdale: Erlbaum.
  • Leech, Geoffrey / Myers, Greg / Thomas, Jenny (eds.) (1995): Spoken English on Computer: Transcription, Markup and Application. Harlow: Longman.

